Introduction

As part of the 2017 CRG Postdocs Retreat, Damjana Kastelic and Benedetta Bolognesi ran an activity in which the postdocs attending the retreat were split into groups and asked to write down together the skills, knowledge and attitudes they have, as well as those they would like to see on their CVs. Here I put together the data from the 4 groups and present them as word clouds.

Paths are relative to /users/GR/mb/jquilez.

Configuration

In [11]:
%pylab inline

# load python packages
import os
import os.path
import pandas as pd
import glob
from IPython.display import Image
from scipy import stats
from matplotlib_venn import *
import seaborn as sns
from wordcloud import WordCloud

# matplotlib options
plt.rcParams['font.size'] = 20 
plt.rcParams['font.weight'] = 'medium' 
plt.rcParams['font.family'] = 'sans-serif' 
plt.rcParams['font.sans-serif'] = 'Arial' 
plt.rcParams['lines.linewidth'] = 2.0
plt.rcParams['legend.numpoints'] = 1
plt.rcParams['legend.frameon'] = False
plt.rcParams['savefig.bbox'] = 'tight'

# seaborn options
sns.set_context("talk", font_scale = 1.5)
sns.set_style("white")
Populating the interactive namespace from numpy and matplotlib
In [2]:
project = 'misc'
analysis = '2017-01-26_world_cloud_crg_postdoc_retreat'
PROJECT = '/Volumes/users-GR-mb-jquilez/projects/%s' % project
ANALYSIS = '%s/analysis/%s' % (PROJECT, analysis)

The data

Raw data

In [5]:
infile = '%s/figures/IMG_1526.jpg' % ANALYSIS
Image(filename = infile, width = 500)
Out[5]:
In [10]:
infile = '%s/figures/IMG_1527.jpg' % ANALYSIS
Image(filename = infile, width = 500)
Out[10]:
In [7]:
infile = '%s/figures/IMG_1528.jpg' % ANALYSIS
Image(filename = infile, width = 500)
Out[7]:
In [8]:
infile = '%s/figures/IMG_1529.jpg' % ANALYSIS
Image(filename = infile, width = 500)
Out[8]:

Processed data

The content of the figures above was transcribed into text files containing the features postdocs believe they have and those that are missing from their CVs.

Word clouds

The wordcloud Python package was used to generate the plots below. Plus-shaped and love-symbol masks were chosen deliberately to display the features we have and those we miss in our CVs, respectively.

In [117]:
def draw_clouds(label):
    
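    # read the text (the file is closed as soon as it has been read)
    ifile = '%s/tables/%s.txt' % (ANALYSIS, label)
    with open(ifile) as f:
        text = f.read()
    
    # read the mask file: a plus for features we have, a love symbol for missing ones
    if label == 'have':
        name = 'plus'
    elif label == 'missing':
        name = 'love'
    else:
        raise ValueError("label must be 'have' or 'missing', got %r" % label)
    ifile = '%s/figures/%s.jpg' % (ANALYSIS, name)
    mask = imread(ifile)  # imread is provided by %pylab inline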

    # Generate a word cloud image (when a mask is given, its shape is used
    # and width/height are ignored)
    wordcloud = WordCloud(width = 250,
                         height = 250,
                         background_color = 'black',
                         random_state = 1,
                         scale = 2,
                         mask = mask).generate(text)
    
    # Display and save the generated image:
    plt.imshow(wordcloud)
    plt.axis("off")
    ofile = '%s/figures/%s.pdf' % (ANALYSIS, label)
    plt.savefig(ofile, facecolor = 'k', bbox_inches = 'tight', dpi = 500)
In [118]:
draw_clouds('have')
In [119]:
draw_clouds('missing')

There are more features we already have than features we miss in our CVs (59 and 43 unique words, respectively). Someone mentioned optimism as a reason for this. Personally, I think it instead reflects both an order bias and a time bias: we probably started with the category we heard first in the description of the activity (i.e. features we already have), and since the time to complete the activity was limited, we probably invested less time in thinking about the missing attributes.
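
Unique-word counts like the ones quoted above could be obtained along the following lines. This is only a minimal sketch, not the code used for this analysis: the helper name count_unique_words and the tokenization (lowercasing and keeping runs of letters, apostrophes and hyphens) are assumptions, so applying it to the transcribed files may not reproduce the 59 and 43 reported here exactly.

```python
import re

def count_unique_words(text):
    """Return the number of distinct words in `text` (hypothetical helper)."""
    # Assumption: a word is a run of letters, optionally containing
    # apostrophes or hyphens, compared case-insensitively.
    words = re.findall(r"[a-z][a-z'-]*", text.lower())
    return len(set(words))

# Toy input for illustration; not taken from the transcribed data.
example = "Teaching, communicating and teaching again"
print(count_unique_words(example))
```

In the notebook, the function would be applied to the text read from the have.txt and missing.txt tables.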

Among the things postdocs at the CRG think we are good at, teaching and communicating came first, while thinking, solving, management, team, work and experiments also showed up fairly frequently. Apart from these, there are several other lower-frequency features; the picture seems quite diverse indeed...
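
A ranking of this kind can be cross-checked by tallying word frequencies directly instead of reading them off the cloud. The sketch below is illustrative only: the helper top_words, the tokenization and the tiny stop-word list are assumptions, and the wordcloud package applies its own stop words and normalization, so the ranks it displays may differ.

```python
import re
from collections import Counter

# Illustrative stop words only; NOT the list used by the wordcloud package.
STOP_WORDS = {"and", "the", "of", "to", "in", "a"}

def top_words(text, n=5):
    """Return the `n` most frequent non-stop words as (word, count) pairs."""
    words = re.findall(r"[a-z][a-z'-]*", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)

# Toy input for illustration; not taken from the transcribed data.
sample = "teaching communicating teaching management and teaching communicating"
print(top_words(sample, n=2))
```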

Interestingly, while we see ourselves as good at management, this is probably not enough, as managing is also among the things we would like to improve. This is also compatible with us currently being good at managing X but bad at managing other issues we see as relevant. Moreover, we want to write better (I find this quite unexpected, as I would say that we already write a lot as PhD students and postdocs) and improve our self-whatever (probably self-marketing, one of the qualities we shyly seek).
